35 research outputs found

    A survey on bias in machine learning research

    Current research on bias in machine learning often focuses on fairness, while overlooking the roots or causes of bias. However, bias was originally defined as a "systematic error," often caused by humans at different stages of the research process. This article aims to bridge the gap between past literature on bias in research by providing a taxonomy for potential sources of bias and errors in data and models. The paper focuses on bias in machine learning pipelines. The survey analyses over forty potential sources of bias in the machine learning (ML) pipeline, providing clear examples for each. By understanding the sources and consequences of bias in machine learning, better methods can be developed for detecting and mitigating it, leading to fairer, more transparent, and more accurate ML models. Comment: Submitted to journal. arXiv admin note: substantial text overlap with arXiv:2308.0946

    Serialization in Object-Oriented Programming Languages

    This chapter describes the process of converting object state into a format that can be transmitted or stored, as implemented in currently used object-oriented programming languages. This process is called serialization (marshalling); the opposite process is called deserialization (unmarshalling). It is a low-level technique, and several technical issues should be considered, such as endianness, size of memory representation, representation of numbers, object references, recursive object connections and others. In this chapter we discuss these issues and propose solutions to them. We also include a short review of the tools currently in use, and we show that meeting all requirements is not possible. Finally, we present a new C++ library that supports forward compatibility.

    Targeted Data Augmentation for bias mitigation

    The development of fair and ethical AI systems requires careful consideration of bias mitigation, an area often overlooked or ignored. In this study, we introduce a novel and efficient approach for addressing biases called Targeted Data Augmentation (TDA), which leverages classical data augmentation techniques to tackle the pressing issue of bias in data and models. Unlike the laborious task of removing biases, our method proposes to insert biases instead, resulting in improved performance. To identify biases, we annotated two diverse datasets: a dataset of clinical skin lesions and a dataset of male and female faces. These bias annotations are published for the first time in this study, providing a valuable resource for future research. Through Counterfactual Bias Insertion, we discovered that biases associated with the frame, ruler, and glasses had a significant impact on models. By randomly introducing biases during training, we mitigated these biases and achieved a substantial decrease in bias measures, ranging from two-fold to more than 50-fold, while maintaining a negligible increase in the error rate

    Pervasive mRNA uridylation in fission yeast is catalysed by both Cid1 and Cid16 terminal uridyltransferases

    Messenger RNA uridylation is pervasive and conserved among eukaryotes, but the consequences of this modification for mRNA fate are still under debate. Utilising a simple model organism to study uridylation may facilitate efforts to understand the cellular function of this process. Here we demonstrate that uridylation can be detected using a simple bioinformatics approach. We utilise it to unravel widespread transcript uridylation in fission yeast and demonstrate the contribution of both Cid1 and Cid16, the only two annotated terminal uridyltransferases (TUT-ases) in this yeast. To detect uridylation in transcriptome data, we used an RNA-sequencing (RNA-seq) library preparation protocol involving initial linker ligation to fragmented RNA, an approach borrowed from small RNA sequencing that was commonly used in older RNA-seq protocols. We next explored the data to detect uridylation marks. Our analysis shows that uridylation in yeast is pervasive, as it is in multicellular organisms. Importantly, our results confirm the role of the cytoplasmic uridyltransferase Cid1 as the primary uridylation catalyst. However, we also observed an auxiliary role of the second uridyltransferase, Cid16. Thus both fission yeast uridyltransferases are involved in mRNA uridylation. Intriguingly, we found no physiological phenotype of the single and double deletion mutants of cid1 and cid16 and only minimal impact of uridylation on steady-state mRNA levels. Our work establishes fission yeast as a potent model to study uridylation in a simple eukaryote, and we demonstrate that it is possible to detect uridylation marks in RNA-seq data without the need for specific methodologies.

    Gastroesophageal reflux disease - unit description, diagnosis and treatment

    GPs are increasingly dealing with patients whose complaints suggest gastro-esophageal reflux disease (GERD). These symptoms include heartburn, abdominal pain, and a feeling of esophageal reflux (regurgitation). GERD is one of the most common gastrointestinal diseases that gastroenterologists meet in their practice (1, 2). In North America the problem affects from 18.1% to as much as 27.8% of the population. The situation is similar in Europe, where the proportion of people with reflux symptoms is in the range of 8.8% to 25.9%. Among European countries, the prevalence of GERD symptoms is higher in the north of the continent than in the south. A worrying fact is that the growing problem of overweight and obesity makes GERD increasingly recognized among children and adolescents (3). Interestingly, reflux-related complaints are much less frequent in eastern Asia, affecting only 2.5% to 7.8% of the population (4).

    In Vitro Antiproliferative and Antioxidant Effects of Extracts from Rubus caesius

    The present study was performed to evaluate the effect of different extracts and subfractions from Rubus caesius leaves on two human colon cancer cell lines, HT29 and SW948, obtained from two stages of disease progression. The tested samples inhibited the viability of cells of both lines in a concentration-dependent manner. The most active was the ethyl acetate fraction which, applied at the highest concentration (250 μg/mL), decreased the viability of cells (HT29 and SW948) below 66%. The extracts and subfractions were also investigated for antioxidant activity in DPPH and FRAP assays. All extracts, with the exception of the water extract at a dose of 250 μg/mL, almost totally reduced DPPH. The highest Fe3+ ion reduction was shown for the diethyl and ethyl acetate fractions. It was more than 6.5 times higher (at a dose of 250 μg/mL) than in the control. The LC-MS studies of the analysed preparations showed that all samples contain a wide variety of polyphenolics, among which ellagitannins turned out to be the main constituents, with dominant ellagic acid, sanguiin H-6, and flavonol derivatives.

    Design status of ASPIICS, an externally occulted coronagraph for PROBA-3

    The "sonic region" of the Sun's corona remains extremely difficult to observe with spatial resolution and sensitivity sufficient to understand the fine-scale phenomena that govern the quiescent solar corona, as well as phenomena that lead to coronal mass ejections (CMEs), which influence space weather. Improvement on this front requires eclipse-like conditions over long observation times. The space-borne coronagraphs flown so far provided continuous coverage of the external parts of the corona, but their over-occulting systems did not permit analysis of the part of the white-light corona where the main coronal mass is concentrated. The proposed PROBA-3 Coronagraph System, also known as ASPIICS (Association of Spacecraft for Polarimetric and Imaging Investigation of the Corona of the Sun), with its novel design, will be the first space coronagraph to cover the range of radial distances between ~1.08 and 3 solar radii, where the magnetic field plays a crucial role in the coronal dynamics, thus providing continuous observational conditions very close to those during a total solar eclipse. PROBA-3 is first and foremost a mission devoted to the in-orbit demonstration of precise formation flying techniques and technologies for future European missions, and it will fly ASPIICS as its primary payload. The instrument is distributed over two satellites flying in formation (approx. 150 m apart) to form a giant coronagraph capable of producing a nearly perfect eclipse, allowing observation of the Sun's corona closer to the rim than ever before. The coronagraph instrument is developed by a large European consortium including about 20 partners from 7 countries under the auspices of the European Space Agency. This paper reviews the recent improvements and design updates of the ASPIICS instrument as it steps into the detailed design phase.

    Self-Supervised Learning to Increase the Performance of Skin Lesion Classification

    To successfully train a deep neural network, a large amount of human-labeled data is required. Unfortunately, in many areas, collecting and labeling data is a difficult and tedious task. Several ways have been developed to mitigate the problem associated with the shortage of data, the most common of which is transfer learning. However, in many cases, the use of transfer learning as the only remedy is insufficient. In this study, we improve the training of deep neural models and increase the classification accuracy under a scarcity of data by using self-supervised learning. Self-supervised learning allows an unlabeled dataset to be used for pretraining the network, as opposed to transfer learning, which requires labeled datasets. The pretrained network can then be fine-tuned using the annotated data. Moreover, we investigated the effect of combining the self-supervised learning approach with transfer learning. It is shown that this strategy outperforms network training from scratch or with transfer learning alone. The tests were conducted on a very important and sensitive application (skin lesion classification), but the presented approach can be applied to a broader family of applications, especially in the medical domain, where the scarcity of data is a real problem.